SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments

Authors

  • Julian Gough
  • Cyrus Chothia
Abstract

The SUPERFAMILY database contains a library of hidden Markov models representing all proteins of known structure. The database is based on the SCOP 'superfamily' level of protein domain classification which groups together the most distantly related proteins which have a common evolutionary ancestor. There is a public server at http://supfam.org which provides three services: sequence searching, multiple alignments to sequences of known structure, and structural assignments to all complete genomes. Given an amino acid or nucleotide query sequence the server will return the domain architecture and SCOP classification. The server produces alignments of the query sequences with sequences of known structure, and includes multiple alignments of genome and PDB sequences. The structural assignments are carried out on all complete genomes (currently 59) covering approximately half of the soluble protein domains. The assignments, superfamily breakdown and statistics on them are available from the server. The database is currently used by this group and others for genome annotation, structural genomics, gene prediction and domain-based genomic studies.
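
The idea of returning a "domain architecture" for a query can be illustrated with a minimal sketch. It assumes the HMM search has already produced a list of candidate domain hits, each with a superfamily label, a region on the query and an E-value. The `DomainHit` type, the `domain_architecture` function, the greedy overlap rule and the `max_evalue` threshold are all hypothetical choices made for this illustration; the abstract does not describe how the server itself resolves competing hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    """One candidate domain match on a query sequence (illustrative type)."""
    superfamily: str  # hypothetical label for a SCOP superfamily
    start: int        # 1-based inclusive start position on the query
    end: int          # 1-based inclusive end position on the query
    evalue: float     # smaller values are more significant


def overlaps(a: DomainHit, b: DomainHit) -> bool:
    """Return True if the two hits share at least one query position."""
    return a.start <= b.end and b.start <= a.end


def domain_architecture(hits, max_evalue=1e-2):
    """Pick non-overlapping hits greedily by E-value, then order them by position.

    The threshold and the greedy rule are assumptions for this sketch,
    not the procedure used by the SUPERFAMILY server.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if hit.evalue > max_evalue:
            continue  # not significant enough under the assumed threshold
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    # Report superfamily labels in N- to C-terminal order.
    return [h.superfamily for h in sorted(kept, key=lambda h: h.start)]


if __name__ == "__main__":
    example_hits = [
        DomainHit("A", 1, 100, 1e-30),
        DomainHit("B", 50, 150, 1e-5),
        DomainHit("C", 120, 220, 1e-10),
        DomainHit("D", 300, 350, 0.5),
    ]
    print(domain_architecture(example_hits))
```

In this toy input, hit B is dropped because it overlaps the more significant hit A, and hit D is dropped because it fails the assumed significance threshold, leaving the ordered list of superfamilies that make up the architecture.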

Related resources

The SUPERFAMILY database in structural genomics.

The SUPERFAMILY hidden Markov model library representing all proteins of known structure predicts the domain architecture of protein sequences and classifies them at the SCOP superfamily level. This analysis has been carried out on all completely sequenced genomes. The ways in which the database can be useful to crystallographers are discussed, in particular with a view to high-throughput struct...

Optimal Hidden Markov Models for All Sequences of Known Structure

Hidden Markov Models (HMMs) are probably the most powerful tool for the detection of protein sequence homology [4]. Maximization of their capabilities and biological usefulness requires the correct interpretation of their scores, and sufficient coverage of the sequence variations that exist in different protein families. Using information available from the SCOP database we investigated optimal...

SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the ...

Large-scale comparison of protein sequence alignment algorithms with structure alignments.

Sequence alignment programs such as BLAST and PSI-BLAST are used routinely in pairwise, profile-based, or intermediate-sequence-search (ISS) methods to detect remote homologies for the purposes of fold assignment and comparative modeling. Yet, the sequence alignment quality of these methods at low sequence identity is not known. We have used the CE structure alignment program (Shindyalov and Bo...

A Network of Hidden Markov Models and Its Analysis

The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before. In this work, taking into account the processes used to build the HMMs, we propose a working hyp...

Journal title:
  • Nucleic Acids Research

Volume 30, Issue 1

Pages: -

Publication date: 2002